For a long time, women philosophers were largely absent from the published record, and the majority of the most famous philosophy texts were written by men. In this report, I compare texts written by the men and women of philosophy, as well as texts written about them, such as biographies and reviews. The comparison techniques used are sentiment analysis and content overlap.
The main dataset used for this analysis can be found at https://www.kaggle.com/kouroshalizadeh/history-of-philosophy. It contains over 300,000 sentences from over 50 texts spanning 10 major schools of philosophy. The represented schools are: Plato, Aristotle, Rationalism, Empiricism, German Idealism, Communism, Capitalism, Phenomenology, Continental Philosophy, and Analytic Philosophy.
Two other datasets that I source for this analysis come from Wikipedia and Google search results. For Wikipedia, I create a dataset that contains each author's Wikipedia biography page, leveraging the wikipedia Python package. (Example for Simone de Beauvoir: https://en.wikipedia.org/wiki/Simone_de_Beauvoir). The expectation is that the authors' Wikipedia pages will be an objective source on their lives regardless of sex. The Google dataset contains the content of the first 10 websites that appear in a search (Example: "philosopher Beauvoir review"). It is scraped from the internet using BeautifulSoup. I expect the Google dataset to be the most subjective, since we are searching for reviews and the review writers' opinions will be clear.
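As a rough sketch of the text-extraction step behind the Google dataset, the snippet below pulls the visible text out of an HTML string. It uses Python's standard-library `html.parser` as a stand-in for BeautifulSoup so that it runs without extra packages. The names `VisibleTextParser` and `extract_text` and the sample HTML are hypothetical and are not taken from `../lib/utils.ipynb`.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside a skipped tag
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html: str) -> str:
    """Return the visible text of an HTML document as a single string."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page content, for illustration only.
sample_html = (
    "<html><head><style>p {color: red}</style></head>"
    "<body><h1>Review</h1><p>A landmark work.</p>"
    "<script>track()</script></body></html>"
)
print(extract_text(sample_html))
```

In a real scrape, the HTML string would come from an HTTP response for each search result, and the extracted text would then go through the same cleaning steps as the other datasets.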
Methodologies used in this report include web scraping, sentiment analysis, and word clouds. Web scraping is the process of using automated code to extract content and data from a website; it is used here to collect the Google data. Sentiment analysis is the process of computationally identifying and categorizing opinions expressed in a piece of text. Opinions are typically categorized as positive, negative, or neutral and can be used to determine the writer's attitude towards a particular topic. A word cloud is a collection of words visualized in different sizes. The bigger and bolder a word appears, the more often it is mentioned within a text.
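To make the idea of sentiment scoring concrete, here is a minimal lexicon-based sketch. It is not the TextBlob or VADER algorithm, and it is not necessarily how the "simple" score in this report is computed; the word lists, the function names `simple_sentiment` and `label`, and the neutral threshold are all assumptions chosen for illustration.

```python
import re

# Tiny illustrative lexicons (assumed); real tools ship far larger ones.
POSITIVE = {"good", "great", "brilliant", "pioneer", "masterpiece"}
NEGATIVE = {"bad", "lazy", "bitter", "poor", "tarnished"}


def simple_sentiment(text: str) -> float:
    """Score in [-1, 1]: (positive - negative) / matched words, 0.0 if none match."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def label(score: float, threshold: float = 0.05) -> str:
    """Map a numeric score to positive, negative, or neutral."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


for sentence in ["A brilliant pioneer", "Lazy and bitter", "Great but lazy"]:
    score = simple_sentiment(sentence)
    print(f"{sentence!r}: {score:+.2f} ({label(score)})")
```

Note how the third sentence scores as neutral because one positive and one negative word cancel out, which is the same effect that can pull averaged scores towards zero.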
Data cleaning methodologies used are removing special characters, removing stopwords, and stemming. The same methodology is applied to all datasets. Removing special characters does exactly what it says: it removes special characters from the text. Stopwords are English words that do not add much meaning to a sentence, so they can safely be ignored without sacrificing the meaning of the sentence. Stemming is the process of reducing a word to its stem, the base form to which prefixes and suffixes attach.
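The three cleaning steps can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code in `../lib/utils.ipynb`: the stopword list is a small hypothetical subset, and `toy_stem` is a crude suffix stripper standing in for a real stemmer such as the Porter stemmer.

```python
import re
from typing import List

# Small illustrative stopword list (assumed); NLTK's English list is much longer.
STOPWORDS = {"the", "a", "an", "is", "are", "of", "and", "to", "in", "it", "that"}

# Suffixes stripped by the toy stemmer, checked in this order.
SUFFIXES = ("ing", "ed", "es", "s")


def remove_special_characters(text: str) -> str:
    """Lowercase, then replace anything that is not a letter or whitespace with a space."""
    return re.sub(r"[^a-z\s]", " ", text.lower())


def toy_stem(word: str) -> str:
    """Strip one common suffix, keeping at least three characters of stem."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def clean(text: str) -> List[str]:
    """Apply the three steps in order: special characters, stopwords, stemming."""
    tokens = remove_special_characters(text).split()
    return [toy_stem(t) for t in tokens if t not in STOPWORDS]


print(clean("The philosopher is questioning the truths of reasoned arguments!"))
```

A production stemmer handles many more cases (for example doubled consonants and irregular endings), so the toy version should only be read as a picture of the idea.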
Further technical and specific explanations of these methodologies and their applications to this report can be found in ../lib/README.md.
import warnings
warnings.filterwarnings('ignore')
%run ../lib/utils.ipynb
running author: Aristotle page title: Aristotle found page for Aristotle, it is Aristotle
running author: Beauvoir page title: Simone de Beauvoir found page for Beauvoir, it is Simone de Beauvoir
running author: Berkeley page title: George Berkeley found page for Berkeley, it is George Berkeley
running author: Davis page title: Angela Davis found page for Davis, it is Angela Davis
running author: Deleuze page title: Gilles Deleuze found page for Deleuze, it is Gilles Deleuze
running author: Derrida page title: Derrida CANT FIND page for Derrida
running author: Descartes page title: René Descartes found page for Descartes, it is René Descartes
running author: Epictetus page title: Epictetus found page for Epictetus, it is Epictetus
running author: Fichte page title: Johann Gottlieb Fichte found page for Fichte, it is Johann Gottlieb Fichte
running author: Foucault page title: Michel Foucault found page for Foucault, it is Michel Foucault
running author: Hegel page title: Georg Wilhelm Friedrich Hegel found page for Hegel, it is Georg Wilhelm Friedrich Hegel
running author: Heidegger page title: Heideggerian terminology found page for Heidegger, it is Heideggerian terminology
running author: Hume page title: David Hume found page for Hume, it is David Hume
running author: Husserl page title: Edmund Husserl found page for Husserl, it is Edmund Husserl
running author: Kant page title: Immanuel Kant found page for Kant, it is Immanuel Kant
running author: Keynes page title: John Maynard Keynes found page for Keynes, it is John Maynard Keynes
running author: Kripke page title: Saul Kripke found page for Kripke, it is Saul Kripke
running author: Leibniz page title: Gottfried Wilhelm Leibniz found page for Leibniz, it is Gottfried Wilhelm Leibniz
running author: Lenin page title: Vladimir Lenin found page for Lenin, it is Vladimir Lenin
running author: Lewis page title: David Lewis (philosopher) found page for Lewis, it is David Lewis (philosopher)
running author: Locke page title: John Locke found page for Locke, it is John Locke
running author: Malebranche cant even find title for Malebranche
running author: Marcus Aurelius page title: Marcus Aurelius found page for Marcus Aurelius, it is Marcus Aurelius
running author: Marx page title: Karl Marx found page for Marx, it is Karl Marx
running author: Merleau-Ponty page title: Maurice Merleau-Ponty found page for Merleau-Ponty, it is Maurice Merleau-Ponty
running author: Moore page title: A. W. Moore (philosopher) found page for Moore, it is A. W. Moore (philosopher)
running author: Nietzsche page title: Friedrich Nietzsche found page for Nietzsche, it is Friedrich Nietzsche
running author: Plato page title: Plato found page for Plato, it is Plato
running author: Popper cant even find title for Popper
running author: Quine cant even find title for Quine
running author: Ricardo cant even find title for Ricardo
running author: Russell page title: Russell's teapot CANT FIND page for Russell
running author: Smith page title: Adam Smith found page for Smith, it is Adam Smith
running author: Spinoza page title: Baruch Spinoza found page for Spinoza, it is Baruch Spinoza
running author: Wittgenstein page title: Ludwig Wittgenstein found page for Wittgenstein, it is Ludwig Wittgenstein
running author: Wollstonecraft page title: Mary Wollstonecraft found page for Wollstonecraft, it is Mary Wollstonecraft
cant find wikipedia page for author Ricardo
The data contain 13 schools of philosophy. Counts of authors, titles, records, and schools by sex are shown below:
display_descriptive_counts()
| sex | authors | titles | records | schools |
|---|---|---|---|---|
| female | 3 | 3 | 18635 | 1 |
| male | 32 | 55 | 339083 | 12 |
There is far more data for men than for women. The data include only 3 female authors compared with 32 male authors, and the men account for about 95% of the records. The male authors collectively have works in all but one school, feminism, which is the only school the female authors write in.
Below is a timeline of each author's first publication, separated by sex.
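The "about 95%" figure follows directly from the record counts in the table above. A short check of the arithmetic (the variable names are illustrative):

```python
# Record counts taken from the descriptive table above.
records = {"female": 18635, "male": 339083}

total = sum(records.values())
shares = {sex: count / total for sex, count in records.items()}

for sex, share in shares.items():
    print(f"{sex}: {count_fmt}" if False else f"{sex}: {share:.1%}")
```

The male share comes out at roughly 94.8% of the 357,718 records, which rounds to the 95% quoted in the text.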
plot_timelines()
We can see that women entered the world of philosophy (in this dataset) much later than men. The male philosophers date all the way back to Plato, around 350 BC. Although men published much earlier than women, only 4 authors have works published before the 1600s. Most of the men's works were published after the 1600s, and many after the 1900s. All of the women's work was published after the 1750s.
Below is a visualization of the average sentiment scores by sex for each data source (philosophy text data, Wikipedia bio data, and Google search data). We are interested in whether one data source or sex has a dramatically different sentiment score compared to the others.
First, we take a look at sentiment scores from the philosophy data (figure below). These scores reflect the authors' average sentiment in their own works.
plot_senti_scores_data()
We can see that women have slightly less positive scores than men, though in general everything in this chart is fairly neutral.
Next, we take a look at the sentiment scores for the Wikipedia data. We expect this source to be the most objective since it is just a biography.
plot_senti_scores_wiki()
Interestingly, the simple and VADER sentiment scores are much lower for women than for men. We expected the Wikipedia biographies to be the most objective data source of all, so this large difference is notable. One possible explanation is that the men's biographies contain more positive words such as "pioneer", since the men are treated as more foundational to the subject of philosophy than the women.
Lastly, we take a look at sentiment scores from the Google search data (below). This data should logically be the most subjective of all, since we searched Google for reviews. The scores will probably reflect each article writer's opinion of the philosopher.
plot_senti_scores_google()
The VADER sentiment score is about the same for men and women. In the TextBlob and simple sentiment scores, women have slightly more positive scores than men. This leads us to believe that women's work in philosophy is received just as positively as men's work, or more so. The near-neutral averages can also be due to positive and negative scores cancelling each other out. For this reason, we also look at boxplots of the data above to see the distribution of the scores.
boxplot_data()
boxplot_wiki()
boxplot_google()
It looks like the VADER score is slightly unreliable, while the simple and TextBlob sentiment scores are more aligned with each other. As expected, almost all plots show a wide range of sentiment scores, so the aggregated sentiments tend to be fairly neutral. However, it is still useful to compare the absolute difference between men and women, as I did above.
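A small numeric example shows why a near-zero average can coexist with a wide spread, which is what the boxplots are meant to reveal. The scores below are made up for illustration and are not values from this report.

```python
import statistics

# Made-up sentence-level scores for illustration only; not values from the report.
scores = [-0.8, -0.6, -0.1, 0.0, 0.1, 0.6, 0.8]

mean = statistics.mean(scores)
# quantiles(n=4) returns the three quartile cut points (Python 3.8+).
q1, median, q3 = statistics.quantiles(scores, n=4)

print(f"mean={mean:.2f}, median={median:.2f}, IQR={q3 - q1:.2f}")
```

Here the strongly positive and strongly negative values cancel, so the mean looks neutral even though the interquartile range spans more than half of the possible score range.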
Word clouds are a great tool to see which words are most popular in texts. The larger words are used more commonly. It is important to note that all works are combined across authors by gender. This is meant to serve as an overall representation of the authors' works, wiki bios, and google reviews. Word clouds are like the stars - the longer you look, the more you'll see!
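Underneath a word cloud is a plain frequency count. The sketch below counts tokens and maps each count to a font size; the token list is hypothetical, and the linear scaling between an assumed minimum and maximum size is a simplification, not the exact layout rule of the wordcloud package.

```python
from collections import Counter

# Hypothetical, already-cleaned tokens; not drawn from the report's data.
tokens = ["reason", "truth", "object", "reason", "fact", "reason", "truth"]

frequencies = Counter(tokens)

# Scale font size linearly with frequency between assumed bounds.
MIN_SIZE, MAX_SIZE = 10, 40
top = frequencies.most_common(1)[0][1]
sizes = {
    word: MIN_SIZE + (MAX_SIZE - MIN_SIZE) * count / top
    for word, count in frequencies.items()
}

for word, count in frequencies.most_common():
    print(f"{word:<8} count={count} size={sizes[word]:.0f}")
```

Because the texts are combined across authors by gender before counting, a single prolific author can dominate the largest words, which is worth keeping in mind when reading the clouds.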
We first compare the word clouds of the philosophy data for men and women.
wordcloud_data()
It is interesting to see that the men's works often use objective words such as "object", "reason", "truth", and "fact", and firm words such as "must". The women's works more often use subjective words such as "situation" and "sometimes", and soft words such as "lover". This can lead us to believe that women are writing from a more subjective perspective. It is also interesting that "man" is mentioned a lot in both sexes' works, but "women" is mentioned a lot only in the women's works.
Next, we look at word clouds for the authors' Wikipedia pages.
wordcloud_wikipedia()
These word clouds look much different compared to the philosophy data word clouds. Both word clouds have words that you'd typically see in a biography: countries of origin, "first" if the author is doing something novel or their first work, "book", author names, etc.
wordcloud_google()
The Google word clouds look pretty similar to the Wikipedia word clouds.
Lastly, we'll take a look at the same word clouds but in venn diagrams. It will be interesting to see the overlap between sexes by source.
First, we'll look at the philosophy data.
venn_wordcloud_data()
It looks like many of the women's words are also found in the men's texts, but the opposite is not true. One word that appears only in the women's texts is "statistic", which is very interesting. This can lead us to believe that the female authors rely on statistics in their texts more than the men do. However, we do see that the word "quantitatively" appears quite often in both sets of texts.
Next, we'll look at the Wikipedia data.
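The regions of a two-set Venn diagram correspond to simple set operations on the two vocabularies. The word sets below are hypothetical examples, not the report's actual word lists, and the overlap ratios are one assumed way to quantify the asymmetry described above.

```python
# Hypothetical vocabularies for illustration; not the report's actual word lists.
women_words = {"situation", "lover", "statistic", "man", "woman"}
men_words = {"object", "reason", "truth", "fact", "must", "man"}

shared = women_words & men_words      # overlapping region of the diagram
only_women = women_words - men_words  # women-only region
only_men = men_words - women_words    # men-only region

# Share of each vocabulary that also appears in the other one.
women_overlap = len(shared) / len(women_words)
men_overlap = len(shared) / len(men_words)

print(sorted(shared), sorted(only_women), sorted(only_men))
print(f"{women_overlap:.0%} of women's words shared, {men_overlap:.0%} of men's")
```

Since the men's corpus is far larger, its vocabulary is likely larger too, so a smaller share of the men's words can appear in the women's texts even when usage habits are similar.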
venn_wordcloud_wiki()
In the Wikipedia data we see the same relationship as in the philosophy data: many of the women's words are also found in the men's text, but the opposite is not true. Here we see some words that appear only in the women's pages and that we didn't see before (or not as large), such as "motherhood" and "futility". A new word that appears prominently (large and in the middle) compared to the philosophy data is "feminism". It is interesting that "feminism" does not overlap in the philosophy data above but does overlap here in the Wikipedia pages. A word that we see in the men's Wikipedia pages but not in the women's is "science"; interestingly, this pattern does not appear in the texts themselves. This can lead us to believe that women's philosophical works aren't perceived to be as scientific as men's works. This is especially interesting considering that, in the previous Venn diagram, the word "statistic" appeared only in the women's texts.
Lastly, we will look at the word cloud Venn diagram for the Google search texts. We hypothesized that Google would be the most subjective data source of them all.
venn_wordcloud_google()
We see a number of new patterns here. Of interest are the extremely positive and negative words that we didn't see before. This is expected, since these are subjective articles reviewing philosophers. Examples of extremely negative words in the men's reviews are "lazy" and "bitter", and in the women's reviews "delusional", "tarnished", and "oversexed". Positive words also appear in both men's and women's reviews; overlapping positive words include "masterpieces" and "approve".
I conclude that men write more matter-of-factly, while women write more subjectively and from their own perspective. Wikipedia, a supposed source of truth, surprisingly has a much more positive average sentiment for men's bio pages than for women's. We also see that Wikipedia describes men's work as more scientific than women's, even though both use the word "quantitatively" in their texts and only women use the word "statistic". Perhaps there is bias among Wikipedia's writers. The sentiment analysis of the Google dataset did not produce major takeaways: men's and women's works have roughly the same sentiment in the Google reviews. In the word clouds, however, we do see strong sentiments in both directions.
!jupyter nbconvert --to html men_and_women_of_philosophy.ipynb